Method of determining the copy number of a nucleotide sequence

ABSTRACT

The invention relates to a method of accurately determining the copy number of a nucleotide sequence I in a sample using an amplification technique, such as PCR. In addition, a second nucleotide sequence II is measured and calibration curves are made for each, from which the relative copy number CN can be determined. According to the present invention, accuracy is improved by performing multiple amplifications in a single well using real time PCR.

The present invention relates to a method of determining the copy number of a nucleotide sequence I in a sample using an amplification technique, said method comprising the steps of

-   1) adding nucleotides, primers, polymerase and any further reagents, if any, required for the amplification technique used to the sample,
-   2) performing one or more amplification cycles to amplify the nucleotide sequence I for which the copy number has to be determined;
-   where the sample contains a chromosomal second nucleotide sequence II, and
-   a) the first nucleotide sequence I is amplified,
-   b) the second nucleotide sequence II is amplified,
-   c) a third nucleotide sequence I′ corresponding to the first nucleotide sequence I and present in a control sample is amplified at various dilutions, and
-   d) a fourth nucleotide sequence II′ corresponding to the second nucleotide sequence II and present in a control sample is amplified at various dilutions,
-   where the ratio of the concentrations of nucleotide sequence I′ and II′ is known,
-   where the amplifications of the third and fourth nucleotide sequences I′ and II′ at various dilutions allow standard curves SC_(i) with i being I′ or II′ to be made, the concentrations of I and II are determined by using the respective standard curve SC_(i), and the relative concentrations allow the relative copy number CN of sequence I (versus nucleotide sequence II) to be determined using the formula

$$CN = \frac{[I]_{SC_{I'}}}{[II]_{SC_{II'}}}$$

where

-   CN is the relative copy number of I over II in the sample;
-   [I]_(SC_(I′)) is the concentration of I determined using standard curve SC_(I′); and
-   [II]_(SC_(II′)) is the concentration of II determined using standard curve SC_(II′).
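The calculation defined above can be sketched in a few lines of Python. This is a minimal illustration and not the procedure of the application: it assumes that each standard curve is a straight-line fit of threshold cycle (Ct) against log10 of the number of copies per well, which is the usual form of a real time PCR standard curve but is not specified in the text. The function names and all numerical values are hypothetical.

```python
import math

def fit_standard_curve(copies, ct_values):
    """Least-squares line Ct = slope * log10(copies) + intercept (assumed curve form)."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def concentration_from_curve(ct, curve):
    """Read a concentration (copies per well) off a fitted standard curve."""
    slope, intercept = curve
    return 10 ** ((ct - intercept) / slope)

def relative_copy_number(ct_i, ct_ii, sc_i, sc_ii):
    """CN = [I] / [II], each concentration taken from its own standard curve."""
    return concentration_from_curve(ct_i, sc_i) / concentration_from_curve(ct_ii, sc_ii)

# Hypothetical dilution series of the control sample (copies per well).
dilutions = [1e2, 1e3, 1e4, 1e5, 1e6]
sc_i = fit_standard_curve(dilutions, [33.4, 30.1, 26.8, 23.5, 20.2])   # curve for I'
sc_ii = fit_standard_curve(dilutions, [34.2, 30.8, 27.4, 24.0, 20.6])  # curve for II'

# Hypothetical Ct readings for sequences I and II in the sample.
cn = relative_copy_number(26.8, 34.2, sc_i, sc_ii)
print(f"relative copy number CN = {cn:.1f}")
```

With these made-up readings, sequence I falls on its curve at about 10,000 copies and sequence II at about 100 copies, so CN comes out at about 100.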

Most eukaryotic diploid cells contain two copies of a single gene, one on each chromosome of a pair of chromosomes. Since the chromosomes of a pair are derived one from each parent, the two genes may be different and, for example, one of them may result in an abnormal protein. Thus, the number of functional genes is not necessarily 2 in a eukaryote, and can be 1 or even 0. While genes are often present in one copy per chromosome of a particular pair of chromosomes, some genes are present in multiple copies, for example in tandem repeat sequences. Another exception to the general rule of 2 copies per cell is mitochondrial DNA. A cell contains many mitochondria, the number being dependent on the type of cell. But even for a particular cell type, the number of mitochondria may vary. Typical numbers are between 100 and 1000 mitochondria per cell, and each mitochondrion contains several copies of mitochondrial DNA. In addition, the typical copy number is not necessarily equal to or larger than 2 per cell. Some nucleotide sequences are very rare among cells (despite the cells being of one and the same subject, such as a human being). This is, for example, the case after gene rearrangement, as occurs in antibody-producing cells (B-lymphocytes) or receptor-carrying T-lymphocytes. Of a large number of lymphocytes, only a few will contain a particular nucleotide sequence defining the variable region of a particular antibody (or of the T-cell receptor) capable of recognizing a particular antigen. In the art, a need exists to reliably determine the copy number of a nucleotide sequence, which may comprise the nucleotide sequence of a gene or part thereof. A method according to the preamble is known in the art.

Such a method is disclosed by Kwok et al. in U.S. Pat. No. 5,389,512.

The object of the present invention is to improve this method so that the copy number of a nucleotide sequence can be determined reliably even if the sequence is present in extreme amounts, such as many copies per cell or only a few copies per many cells. In addition, an object of the present invention is to provide a method which has reduced sensitivity to the efficiency with which DNA was extracted from the cells containing a nucleotide sequence I for which the copy number has to be determined.

To this end, the method according to the present invention is characterized in that

-   at least one pair of amplification reactions chosen from i) a) and b), and ii) c) and d) is performed in a single container and monitored spectrophotometrically during amplification, and
-   the third nucleotide sequence I′ and fourth nucleotide sequence II′ reside on a single vector.

This allows for a more accurate measurement of relative or absolute copy numbers of nucleotide sequence I. Suitable spectrophotometrical methods are known in the art. More specifically, such methods rely on internal probes for real time measurements, for example real time PCR. Internal probes are known in the art, and are disclosed by, for example, Winer et al. (Anal. Biochem. 270, pp. 41-49 (1999)). Measurements can be done either continuously, or after finishing an amplification cycle. While specific reference is made to standard curves, it goes without saying that this can be done using computational methods without an actual graph being made. Hence, in the present application the phrase “making a standard curve” involves any method using at least two reference points to determine a (relative) concentration. Generally, all amplifications will be performed substantially at the same time. By performing multiple amplifications in one container, the room for error is reduced. The method according to the invention is not only highly accurate, but it is also very efficient if performed for multiple samples. That is, for each nucleotide sequence I for which it is desired to determine the copy number, only a single standard curve SC_(II′) has to be made. With respect to the term “corresponding” as used in the present invention in conjunction with nucleotide sequences, this is intended to mean that the nucleotide sequences I and I′ (and II and II′), or more specifically the nucleotide sequence of one and the complementary sequence of the other, are capable of hybridizing under stringent conditions. If the sequences I and I′ (and II and II′) do not have the same length, the shortest of the two is preferably at most 50% shorter, more preferably at most 30% shorter.
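The remark that a standard curve needs no actual graph, only at least two reference points, can be illustrated as follows. The sketch assumes, purely for illustration, that the measured signal is a threshold cycle (Ct) that varies linearly with log10 of the copy number; the function name and the reference values are hypothetical.

```python
import math

def quantity_from_two_points(ct, ref_a, ref_b):
    """Log-linear interpolation between two reference points given as (copies, Ct)."""
    (copies_a, ct_a), (copies_b, ct_b) = ref_a, ref_b
    # Slope of Ct versus log10(copies) through the two reference points.
    slope = (ct_b - ct_a) / (math.log10(copies_b) - math.log10(copies_a))
    log_copies = math.log10(copies_a) + (ct - ct_a) / slope
    return 10 ** log_copies

# Hypothetical reference wells holding 1e3 and 1e5 copies of the control sequence.
estimate = quantity_from_two_points(26.5, (1e3, 30.0), (1e5, 23.0))
print(f"estimated copies: {estimate:.0f}")
```

Two points define the line completely, so no plotting step is involved; additional reference points would only serve to average out measurement noise.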

The third nucleotide sequence I′ and fourth nucleotide sequence II′ residing on the same vector allows their ratio to be constant and exactly known (for example 1:1). This allows for the most accurate measurements possible. It is possible to subject the vector containing both nucleotide sequences I′ and II′ to a digestion using one or more restriction enzymes, optionally followed by purification, to yield a linear molecule containing both nucleotide sequences I′ and II′, and to use this molecule for the amplifications required for the standard curves.

In the present application, a vector is understood to be any nucleotide sequence consisting of or containing the nucleotide sequence(s) to be amplified. When present on a vector capable of being replicated in vitro or in vivo, it is easy to obtain that particular nucleotide sequence in desired quantities. It is also very easy to determine the DNA concentration and hence the copy number of the nucleotide sequence per volume. A vector capable of replication or being replicated may be any such vector known in the art, such as a plasmid, a cosmid, a virus etc. If, according to a favourable embodiment, the third nucleotide sequence I′ resides on a first vector and the fourth nucleotide sequence II′ resides on a second vector, the vectors can be used (or mixed) at any desired ratio to accommodate expected differences in copy number in the sample.
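The step from a measured DNA concentration to a copy number per volume can be sketched as below. The sketch assumes a double-stranded vector and the common approximation of 650 g/mol per base pair; the vector length and concentration are hypothetical, and the function name is introduced here for illustration only.

```python
AVOGADRO = 6.02214076e23   # molecules per mole
MEAN_BP_MASS = 650.0       # g/mol per base pair of dsDNA (approximation, assumed)

def vector_copies_per_microlitre(conc_ng_per_ul, vector_length_bp):
    """Convert a DNA mass concentration into vector copies per microlitre."""
    grams_per_ul = conc_ng_per_ul * 1e-9
    molar_mass = vector_length_bp * MEAN_BP_MASS   # g/mol for the whole vector
    return grams_per_ul / molar_mass * AVOGADRO

# Hypothetical vector of 3400 bp at 10 ng per microlitre.
copies = vector_copies_per_microlitre(10.0, 3400)
print(f"{copies:.2e} vector copies per microlitre")
```

When I′ and II′ sit on the same vector in a 1:1 ratio, this single figure gives the copies per volume of both sequences at once.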

Douek et al. (Nature 396, pp. 690-695 (1998)) describe a method for detecting the products of the rearrangements of T-cell receptors (TREC) using a semi-quantitative assay. For determining the amount of TREC in a given sample, a known amount of a DNA competitor is prepared. Then, an amount of sample DNA containing the nucleotide sequence to be determined is added to the tube. A PCR amplification reaction is carried out in the presence of radiolabeled deoxynucleotide. Subsequently, the resulting amplification products are run on a gel to separate the sample DNA PCR product from the competitor DNA product. After autoradiography, the amount of nucleotide sequence to be determined is calculated using densitometric analysis from the ratio between a band of competitor DNA and a band of the sample DNA. The result is expressed as the number of copies of TREC per microgram total DNA. To achieve an acceptable accuracy, 4 tubes containing scalar amounts of competitor DNA are used, to which fixed amounts of sample DNA are added. The disadvantage of this method is that when DNA is extracted from cells, it must be assumed that this is all the DNA present in the cells. That is, it is assumed that no cell escaped lysis and all DNA present in the cells was extracted and isolated. This is not necessarily the case. Another disadvantage of this method is that it is sensitive to differences in amplification efficiency.

The European patent publication EP 0 959 140 discloses a method and apparatus for determining quantities of nucleic acid sequences in samples using standard curves and amplification ratio estimates. A plurality of standard samples each containing a known quantity of a nucleic acid control sequence, and a test sample containing a known quantity of the nucleic acid control sequence plus a nucleic acid sequence in an unknown concentration, are subjected to an amplification reaction. The concentration of the nucleic acid present in an unknown concentration in the test sample is determined.

The European patent publication EP 1 138 783 discloses a method for the quantification of a nucleic acid in a test sample, by determining the amplification efficiency under defined conditions, and performing the quantification of said nucleic acid under the same defined conditions, allowing correction of the concentration determined for said nucleic acid.

According to a preferred embodiment, the absolute copy number is determined by multiplying the copy number CN by the absolute copy number of sequence II per cell.

For several nucleotide sequences II the number of copies per cell is known. Examples are the gene coding for heat shock protein 70 and the gene coding for Fas Ligand (CD178), which are known to be present with two copies per cell (i.e. the absolute copy number of hsp 70=2). Many nucleotide sequences of genes are very suitable because they generally are present in a known number of copies in every cell of the species from which the DNA is derived. The efficiency with which DNA material is extracted from the cells is not important (although, in case nucleotide sequence I is on a different molecule than nucleotide sequence II, it is important that they are extracted with the same efficiency). Hence, this embodiment allows determination of the absolute copy number of the nucleotide sequence I per cell.
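Why the extraction efficiency drops out can be made explicit with a short derivation. The symbols below are introduced here for illustration and do not appear in the application: N is the number of cells processed, V the final sample volume, η the fraction of DNA recovered (assumed equal for both sequences, as required above), and n_I and n_II the copies per cell of sequences I and II.

```latex
\begin{align}
[I]  &= \frac{\eta \, N \, n_{I}}{V}, &
[II] &= \frac{\eta \, N \, n_{II}}{V}, \\
CN   &= \frac{[I]}{[II]} = \frac{n_{I}}{n_{II}}, &
n_{I} &= CN \cdot n_{II}.
\end{align}
```

Both η and N cancel in the ratio, so neither the extraction yield nor the number of cells needs to be known; only the copy number of sequence II per cell (for example 2) is required to obtain n_I.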

According to a preferred embodiment, at least two different third nucleotide sequences I′ for measuring a corresponding number of different first nucleotide sequences I reside on a single vector.

In other words, a single vector, requiring its concentration to be determined only once, can carry multiple third nucleotide sequences I′, which allows, for example, the copy numbers of many different genes to be determined.

Preferably, the sequence of the first nucleotide sequence I is the same as the third nucleotide sequence I′.

This strongly reduces errors due to differences in amplification efficiencies between I and I′. Nevertheless, small differences in nucleotide sequence are generally allowed, although changes at the location where the probe used for detecting the concentration of the nucleotide sequence binds are best avoided. In other words, it is highly preferred if the probe is a perfect match for the sequence where it is intended to bind.

Similarly, it is preferred that the sequence of the second nucleotide sequence II is the same as the fourth nucleotide sequence II′.

While the present invention is described with reference to DNA, the present invention also applies to the determination of the number of RNA sequences present in a cell. Use can be made of methods known in the art to multiply RNA, for example by preparing cDNA. This application does not attempt to teach an interested layman how to become a person skilled in the art, for which reason the layman is referred to general textbooks, and in particular to a proper university, to learn the required techniques; a person skilled in the art knows how to apply these techniques to work the present invention.

The present invention will now be illustrated with reference to the drawings where

FIG. 1 represents a standard curve for an mtDNA sequence I′ (circles) plus data for nucleotide sequence I (squares);

FIG. 2 represents a standard curve for a nuclear DNA sequence II′ (circles) plus data for nucleotide sequence II (squares);

FIG. 3 represents a standard curve for a nuclear DNA sequence I′ (circles) plus data for nucleotide sequence I (squares);

FIG. 4 represents a standard curve for a nuclear DNA sequence II′ (FasL) (circles) plus data for nucleotide sequence II (squares); and

FIG. 5 shows the effect of age on the numbers of copies of TREC in peripheral lymphocytes (percentage of lymphocyte cells expressing TREC).

The method according to the invention will be illustrated using two Examples. The first relates to the quantitative analysis of mitochondrial DNA (mtDNA) and demonstrates the technique for determining multiple copies per cell. The second Example demonstrates the quantitative determination of a fractional copy number of a particular nucleotide sequence per cell.

EXAMPLE 1

Materials and Methods

Primers

The nucleotide sequence I (mtDNA) was a stretch having a length of 102 nucleotides, and corresponds to part of the enzyme NADH dehydrogenase as coded for by mtDNA. Amplification of nucleotide sequence I was performed using a set of primers, each having a length of 21 nucleotides and synthesized using standard procedures. The sequences of both primers were checked to be unique for human mtDNA using Blast software, through the NCBI site at NIH (http://www.ncbi.nlm.nih.gov/blast/).

The nucleotide sequence II (nuclear DNA), serving as a reference, was a stretch having a length of 104 nucleotides and was part of the FasL gene, which is present with two copies per human cell. Amplification of nucleotide sequence II was performed using a set of primers having lengths of 21 and 24 nucleotides, respectively.

Probes

To monitor the progress of amplification, a probe was used for nucleotide sequence I, the probe having a length of 23 nucleotides, having a FAM (carboxy fluorescein) fluorescent label at the 5′ end and a BlackHole Quencher1™ group at the 3′ end. This probe, and all others in this application, was ordered commercially from MWG, Ebersberg, Germany. The sequence of the probe was checked to be unique for human mtDNA using Blast software, through the NCBI site as mentioned above.

The probe used for nucleotide sequence II had a length of 22 nucleotides and contained TexasRed as the fluorescent label and a BlackHole Quencher2™ group at the 3′ end (MWG).

DNA isolation

DNA was isolated from HL60, a promyelocytic leukaemia cell line, using a DNA isolation kit from Qiagen, Hilden, Germany according to the instructions of the manufacturer.

Control

A vector was constructed from pGEM-11Z (Promega), containing the sequences I′ and II′ head to tail, using standard genetic engineering techniques in E. coli, as familiar from Sambrook et al. (Molecular cloning. A lab manual. (1989)). The nucleotide sequences I′ and II′ were identical to their respective I and II counterparts, and present on the vector in a highly defined 1:1 ratio.

The absolute concentration of the controls was determined using limiting dilution assays (Sambrook).

Amplification

Amplification was performed using an iCycler Thermal cycler (BioRad, Hercules, Calif., USA) using standard procedures. The amplification was performed in plates having 96 wells. This instrument allows monitoring of fluorescence in up to 4 different channels. In short, one cycle of denaturation (95° C. for 6 min) was performed, followed by 45 cycles of amplification (94° C. for 30 s, 60° C. for 60 s). The amplification was performed in a mix that consisted of: Promega PCR buffer 1× (Promega, Madison, Wis., USA), 3.0 mM MgCl2, 400 pmol of primers for mtDNA, 0.2 mM dNTP and 2 U of Taq polymerase (Promega). In accordance with the invention, the amplifications of nucleotide sequences I and II were performed in a single well, and the same is true for nucleotide sequences I′ and II′ (for determining the standard curves). Data were analysed using the software of the iCycler.

The standard curves were made by introducing a known number of copies of vector per well.
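The plate layout implied here can be sketched as a serial dilution of the control vector. The ten-fold scheme, the starting quantity and the number of steps are assumptions made for illustration; the Example does not state them.

```python
def dilution_series(start_copies, factor, steps):
    """Copies of vector per well for a serial dilution (hypothetical scheme)."""
    return [start_copies / factor ** i for i in range(steps)]

series = dilution_series(1_000_000, 10, 5)

# I' and II' sit on the same vector in a 1:1 ratio, so each well holds
# the same known number of copies of both control sequences.
wells = [{"copies_I_prime": c, "copies_II_prime": c} for c in series]

for well in wells:
    print(well)
```

Because both control sequences are delivered by the same molecule, a pipetting error shifts the points of both standard curves by the same amount and leaves their ratio untouched.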

Amplification experiments were performed in triplicate.

Results

FIG. 1 shows the standard curve for nucleotide sequence I′ and FIG. 2 shows the standard curve for the nucleotide sequence II′ based on FasL. Note the excellent correlation coefficients of 0.995 and 0.996 respectively, indicating the excellent accuracy of the method according to the invention. Using these curves, the concentrations of nucleotide sequences I and II (shown as squares in FIGS. 1 and 2) were determined. As it is known that the nucleotide sequence for FasL (and more specifically for the probe for nucleotide sequence II/II′) is present with two copies per cell, the number of copies of nucleotide sequence I per cell is twice the relative copy number CN, i.e. 160.
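A correlation coefficient of the kind quoted for the standard curves can be computed as in the following sketch. It assumes the coefficient is the Pearson correlation between log10 of the copies per well and the threshold cycle (Ct), reported by magnitude; the application does not state how the instrument software defines it, and the data points below are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical standard curve: copies per well and slightly noisy Ct readings.
copies = [1e2, 1e3, 1e4, 1e5, 1e6]
ct = [33.6, 30.0, 26.9, 23.4, 20.2]

r = pearson_r([math.log10(c) for c in copies], ct)
print(f"|r| = {abs(r):.4f}")
```

The sign of r is negative because Ct falls as the starting quantity rises; a magnitude close to 1 indicates that the dilution points lie close to a straight line.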

EXAMPLE 2

Basically, the same method was used as described in Example 1, except that the nucleotide sequence I corresponded to part of the sequence of the delta locus of the T-cell receptor. The method was used to determine the number of copies of TREC per cell, in particular in peripheral lymphocytes in blood, in three age groups (healthy humans of 20, 60 or 100 years). The numbers of people were 16 (10), 17 (10), and 21 (17), respectively, with the number of women between parentheses.

The standard curves for nucleotide sequences I′ and II′ are shown in FIGS. 3 and 4 respectively. The correlation coefficients obtained were 0.999 and 0.998, respectively.

FIG. 5 shows that the number of copies of TREC decreases with age (averages per age group shown as a horizontal line) from about 3.2 to 0.1 per 100 cells.
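The conversion from a fractional relative copy number to copies per 100 cells can be sketched as follows. The relative copy numbers used below are back-calculated from the averages of about 3.2 and 0.1 per 100 cells, assuming two copies of the FasL reference per cell as in Example 1; they are not values reported in the application, and the function name is hypothetical.

```python
def copies_per_100_cells(relative_cn, reference_copies_per_cell=2):
    """Absolute copies of sequence I per 100 cells from the relative copy number CN."""
    return relative_cn * reference_copies_per_cell * 100

# Back-calculated (assumed) relative copy numbers for the two extremes.
highest = copies_per_100_cells(0.016)    # about 3.2 per 100 cells
lowest = copies_per_100_cells(0.0005)    # about 0.1 per 100 cells

fold_decrease = highest / lowest
print(f"{highest:.1f} -> {lowest:.1f} per 100 cells ({fold_decrease:.0f}-fold)")
```

Under these assumptions the averages in FIG. 5 correspond to a decrease of roughly 32-fold between the extremes.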

While spectrophotometrical methods are particularly beneficial for the method according to the present invention in view of the fact that they allow simultaneous detection of multiple labels, it is possible to perform an amplification reaction using any known amplification technique, where the third nucleotide sequence I′ and fourth nucleotide sequence II′ reside on a single vector and the amplifications of each of I′ and II′ are performed in separate containers, such as separate wells. The application covers this possibility as well. Such amplification techniques comprise, apart from the ones mentioned above, CP (Cycling Probe Reaction), bDNA (Branched DNA amplification), SSR (Self-Sustained Sequence Replication), SDA (Strand Displacement Amplification), QBR (Q-Beta Replicase), Re-AMP (Formerly RAMP), NASBA (Nucleic Acid Sequence Based Amplification), RCR (Repair Chain Reaction), LCR (Ligase Chain Reaction), TAS (Transcription Based Amplification System), and HCS (amplifies ribosomal RNA).

1. A method of determining the copy number (CN) of a first nucleotide sequence I (NucSeqI) in a sample using an amplification technique, said method comprising the steps of: (1) adding to the sample nucleotides, primers, polymerase and optionally, any additional reagents, required for amplification, (2) performing one or more amplification cycles to amplify the NucSeqI; wherein the sample comprises a chromosome-derived second nucleotide sequence II (NucSeqII), and the following amplification steps are carried out: (a) NucSeqI is amplified, (b) NucSeqII is amplified, (c) a third nucleotide sequence I′ (NucSeqI′) corresponding to NucSeqI and present in a control sample is amplified at multiple dilutions, and (d) a fourth nucleotide sequence II′ (NucSeqII′) corresponding to NucSeqII and present in a control sample is amplified at multiple dilutions, wherein the ratio of concentration of NucSeqI′ and the concentration of NucSeqII′ is known, wherein amplification of the NucSeqI′ and NucSeqII′ at multiple dilutions results in the generation of standard curves SC_(I) and SC_(II), respectively, such that the concentrations of NucSeqI and NucSeqII are determined using the respective standard curves SC_(I) and SC_(II), such that the copy number CN of NucSeqI relative to NucSeqII is determined using the formula ${CN} = \frac{\text{Conc-I}_{SCI}}{\text{Conc-II}_{SCII}}$ wherein (i) CN is the relative copy number of NucSeqI (relative to NucSeqII) in the sample; (ii) Conc-I_(SCI) is the concentration of NucSeqI determined using standard curve SC_(I); and (iii) Conc-II_(SCII) is the concentration of NucSeqII determined using standard curve SC_(II), and wherein: at least one pair of amplification reactions selected from (a) and (b), and (c) and (d) is performed in a single container and monitored spectrophotometrically during amplification, and NucSeqI′ and NucSeqII′ are localized on a single vector.
2. A method according to claim 1, wherein an absolute copy number is determined by multiplying CN by the absolute number of copies of NucSeqII per cell.
3. A method according to claim 1, wherein at least two different NucSeqI′ sequences used for measuring a corresponding number of different NucSeqI sequences are localized on a single vector.
4. A method according to claim 1, wherein the sequences of NucSeqI and NucSeqI′ are the same.
5. A method according to claim 1 wherein the sequences of NucSeqII and NucSeqII′ are the same.
6. A method according to claim 2, wherein at least two different NucSeqI′ sequences used for measuring a corresponding number of different NucSeqI are localized on a single vector.
7. A method according to claim 2 wherein the sequences of NucSeqI and the NucSeqI′ are the same.
8. A method according to claim 3 wherein the sequences of NucSeqI and the NucSeqI′ are the same.
9. A method according to claim 6 wherein the sequences of NucSeqI and the NucSeqI′ are the same.
10. A method according to claim 2 wherein the sequences of NucSeqII and the NucSeqII′ are the same.
11. A method according to claim 3 wherein the sequences of NucSeqII and the NucSeqII′ are the same.
12. A method according to claim 4 wherein the sequences of NucSeqII and the NucSeqII′ are the same.
13. A method according to claim 6 wherein the sequences of NucSeqII and the NucSeqII′ are the same.
14. A method according to claim 7 wherein the sequences of NucSeqII and the NucSeqII′ are the same.
15. A method according to claim 8 wherein the sequences of NucSeqII and the NucSeqII′ are the same.
16. A method according to claim 9 wherein the sequences of NucSeqII and the NucSeqII′ are the same.